Diagnose log injection smoke test flakiness instead of masking it by bm1549 · Pull Request #11075 · DataDog/dd-trace-java

bm1549 · 2026-04-09T18:33:13Z

What Does This Do

Adds diagnostic instrumentation to the check raw file injection smoke test so the next CI failure tells us the root cause instead of a bare "Condition not satisfied after 30s" with traceCount=0.

Changes to LogInjectionSmokeTest:

waitForTraceCountAlive: checks process liveness on every poll iteration; if the process dies, fails immediately with the exit code and the last 20 lines of process output
Enriched timeout errors: on timeout, dumps whether the process is alive, traceCount, the number of RC polls received, and the last 30 lines of process output
Reordering: waitForTraceCount(4) now runs before waitFor, and the waitFor return value is asserted
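
The liveness-aware wait described above can be sketched in Java. This is a minimal illustration under stated assumptions, not the Groovy code in LogInjectionSmokeTest: the class name, the `ProcessDiedException` type, and the supplier-based parameters are hypothetical, chosen so the loop can be exercised without a real process.

```java
import java.util.function.BooleanSupplier;
import java.util.function.IntSupplier;

// Hypothetical helper: polls a trace counter while watching process liveness.
final class TraceCountWaiter {

  // Hypothetical exception type; unchecked so it is not mistaken for a failed condition.
  static final class ProcessDiedException extends RuntimeException {
    ProcessDiedException(String message) {
      super(message);
    }
  }

  static int waitForTraceCountAlive(
      IntSupplier traceCount,
      BooleanSupplier alive,
      IntSupplier exitCode,
      int expected,
      long timeoutMillis,
      long intervalMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (true) {
      int seen = traceCount.getAsInt();
      if (seen >= expected) {
        return seen; // enough traces arrived; a normal process exit is not an error
      }
      if (!alive.getAsBoolean()) {
        // Fail right away instead of waiting out the rest of the timeout.
        throw new ProcessDiedException(
            "Process exited with code " + exitCode.getAsInt()
                + " while waiting for " + expected + " traces (received " + seen + ")");
      }
      if (System.nanoTime() - deadline >= 0) {
        throw new AssertionError(
            "Timed out waiting for " + expected + " traces. traceCount=" + seen
                + ", process.alive=true");
      }
      try {
        Thread.sleep(intervalMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while waiting for traces", e);
      }
    }
  }
}
```

With a real `java.lang.Process`, the liveness and exit-code suppliers could be `process::isAlive` and `process::exitValue`.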

Motivation

CI Visibility data for the last 30 days on master shows 10 failures of check raw file injection:

Failure mode	Count	Line	Duration	Root cause
`traceCount=0` at `waitForTraceCount(2)`	9/10	368	30.3s	Unknown — no diagnostics
`logLines.size()=3` at `assertRawLogLinesWithInjection`	1/10	229	8.3s	Incomplete log file

The failure distribution is bimodal: successful runs complete in 3.5-8.7s (80 data points, zero above 9s), while the nine `traceCount=0` failures sit at exactly 30.3s, the timeout. There is nothing in between (the tenth failure, at 8.3s, is a different failure mode: an incomplete log file). This means the process either works or is totally broken, so a timeout increase would just delay the same failure.

<9s:  ████████████████████████████████████████  80/80 passes
9-30s:                                           0 runs
30s:  █████████                                  9/10 failures (at timeout)

The current test is blind during the wait — it just polls traceCount in a loop. We don't know if the process crashed, hung during agent init, failed to connect to the test server, or something else entirely. This PR makes the next failure self-diagnosing.

Example output when process crashes:

Process exited with code 1 while waiting for 2 traces (received 0, RC polls: 3).
Last process output:
[dd.trace ...] ERROR ... NullPointerException during instrumentation
...

Example output on timeout (process alive but not sending traces):

Timed out waiting for 2 traces after 30s. traceCount=0, process.alive=true, RC polls received: 142.
Last process output:
[dd.trace ...] DEBUG ... Still loading instrumentations...
...
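
A timeout message shaped like the example above could be assembled as follows. This is a sketch only: the class and method names are hypothetical, and the real test may collect and format its process output differently.

```java
import java.util.List;

// Hypothetical helper: builds a diagnostic message from captured process output.
final class ProcessOutputTail {

  // Returns a view of the last n lines, or all lines if fewer than n were captured.
  static List<String> lastLines(List<String> lines, int n) {
    int from = Math.max(0, lines.size() - n);
    return lines.subList(from, lines.size());
  }

  // Mirrors the layout of the example timeout output; 30 tail lines as described in this PR.
  static String timeoutMessage(
      int expected,
      int timeoutSeconds,
      int traceCount,
      boolean alive,
      int rcPolls,
      List<String> output) {
    return "Timed out waiting for " + expected + " traces after " + timeoutSeconds + "s."
        + " traceCount=" + traceCount
        + ", process.alive=" + alive
        + ", RC polls received: " + rcPolls + ".\n"
        + "Last process output:\n"
        + String.join("\n", lastLines(output, 30));
  }
}
```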

Additional Notes

Only LogInjectionSmokeTest.groovy is changed
No timeout increase — the 30s defaultPoll is kept as-is
All 11 historically flaky backends pass locally
rcClientMessages.size() tells us whether the agent connected to the test server at all (RC polls hit /v0.7/config every 200ms)
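
As a rough aid for reading the RC poll count, the arithmetic can be written out. With the 200ms poll interval stated above and a 30s wait, a connected agent would send at most about 150 polls, so the 142 in the timeout example earlier in this description is consistent with an agent that connected shortly after startup, while a count of 0 would point to no connection at all. The helper below is hypothetical and only illustrates that calculation.

```java
// Hypothetical helper: upper bound on RC polls expected within a wait window.
final class RcPollEstimate {

  static long expectedPolls(long windowMillis, long pollIntervalMillis) {
    if (pollIntervalMillis <= 0) {
      throw new IllegalArgumentException("pollIntervalMillis must be positive");
    }
    // Integer division: only completed intervals count as polls.
    return windowMillis / pollIntervalMillis;
  }
}
```

For the values in this PR, `RcPollEstimate.expectedPolls(30_000, 200)` evaluates to 150.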

Contributor Checklist

Format the title according to the contribution guidelines
Assign the type: and (comp: or inst:) labels
Avoid using close, fix, or any linking keywords when referencing an issue
Update the CODEOWNERS file on source file addition, migration, or deletion — N/A (no file additions)
Update public documentation with any new configuration flags or behaviors — N/A (test-only change)

tag: no release notes
tag: ai generated

🤖 Generated with Claude Code

The `check raw file injection` test has been flaking across 11+ logging backend variants for months. CI Visibility data shows 90% of failures are `traceCount=0` at `waitForTraceCount(2)` after exactly 30s: the JVM + agent bytecode instrumentation simply takes >30s on overloaded CI machines.

Changes:
- Add `startupPoll` with 120s timeout for the initial `waitForTraceCount(2)` that covers JVM startup + agent init, giving 4x headroom over the current 30s `defaultPoll`
- Add `waitForTraceCountAlive` that checks process liveness on each poll iteration, turning silent 30-120s timeouts into instant, actionable errors when the process crashes
- Reorder `waitForTraceCount(4)` before `waitFor` to confirm all traces are delivered while the process is still alive
- Assert `waitFor` return value for a clear error if the process hangs

tag: no release note

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

pr-commenter · 2026-04-09T19:20:03Z

Benchmarks

Startup

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	brian.marks/fix-log-injection-smoke-test-flake
git_commit_date	1775744045	1775836958
git_commit_sha	`b266e2d`	`9eb11aa`
release_version	1.62.0-SNAPSHOT~b266e2d0c2	1.62.0-SNAPSHOT~9eb11aac61

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1775838770	1775838770
ci_job_id	1586148684	1586148684
ci_pipeline_id	107145233	107145233
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-0-xowwlhrv 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-0-xowwlhrv 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module	Agent	Agent
parent	None	None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 58 metrics, 13 unstable metrics.

Startup time reports for petclinic

gantt
    title petclinic - global startup overhead: candidate=1.62.0-SNAPSHOT~9eb11aac61, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.063 s) : 0, 1063231
Total [baseline] (11.036 s) : 0, 11036293
Agent [candidate] (1.059 s) : 0, 1059136
Total [candidate] (11.063 s) : 0, 11063171
section appsec
Agent [baseline] (1.258 s) : 0, 1258131
Total [baseline] (11.191 s) : 0, 11191006
Agent [candidate] (1.248 s) : 0, 1247667
Total [candidate] (11.227 s) : 0, 11227484
section iast
Agent [baseline] (1.223 s) : 0, 1223111
Total [baseline] (11.405 s) : 0, 11404977
Agent [candidate] (1.232 s) : 0, 1232189
Total [candidate] (11.365 s) : 0, 11365187
section profiling
Agent [baseline] (1.186 s) : 0, 1185901
Total [baseline] (11.17 s) : 0, 11169917
Agent [candidate] (1.187 s) : 0, 1186759
Total [candidate] (11.109 s) : 0, 11109348

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.063 s	-
Agent	appsec	1.258 s	194.9 ms (18.3%)
Agent	iast	1.223 s	159.88 ms (15.0%)
Agent	profiling	1.186 s	122.67 ms (11.5%)
Total	tracing	11.036 s	-
Total	appsec	11.191 s	154.713 ms (1.4%)
Total	iast	11.405 s	368.684 ms (3.3%)
Total	profiling	11.17 s	133.623 ms (1.2%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.059 s	-
Agent	appsec	1.248 s	188.531 ms (17.8%)
Agent	iast	1.232 s	173.053 ms (16.3%)
Agent	profiling	1.187 s	127.623 ms (12.0%)
Total	tracing	11.063 s	-
Total	appsec	11.227 s	164.314 ms (1.5%)
Total	iast	11.365 s	302.017 ms (2.7%)
Total	profiling	11.109 s	46.178 ms (0.4%)

gantt
    title petclinic - break down per module: candidate=1.62.0-SNAPSHOT~9eb11aac61, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.246 ms) : 0, 1246
crashtracking [candidate] (1.222 ms) : 0, 1222
BytebuddyAgent [baseline] (634.887 ms) : 0, 634887
BytebuddyAgent [candidate] (634.371 ms) : 0, 634371
AgentMeter [baseline] (29.693 ms) : 0, 29693
AgentMeter [candidate] (29.438 ms) : 0, 29438
GlobalTracer [baseline] (250.611 ms) : 0, 250611
GlobalTracer [candidate] (248.879 ms) : 0, 248879
AppSec [baseline] (32.273 ms) : 0, 32273
AppSec [candidate] (32.071 ms) : 0, 32071
Debugger [baseline] (60.375 ms) : 0, 60375
Debugger [candidate] (60.007 ms) : 0, 60007
Remote Config [baseline] (618.742 µs) : 0, 619
Remote Config [candidate] (601.417 µs) : 0, 601
Telemetry [baseline] (8.238 ms) : 0, 8238
Telemetry [candidate] (8.048 ms) : 0, 8048
Flare Poller [baseline] (9.016 ms) : 0, 9016
Flare Poller [candidate] (8.291 ms) : 0, 8291
section appsec
crashtracking [baseline] (1.24 ms) : 0, 1240
crashtracking [candidate] (1.216 ms) : 0, 1216
BytebuddyAgent [baseline] (668.022 ms) : 0, 668022
BytebuddyAgent [candidate] (661.304 ms) : 0, 661304
AgentMeter [baseline] (12.143 ms) : 0, 12143
AgentMeter [candidate] (12.001 ms) : 0, 12001
GlobalTracer [baseline] (250.991 ms) : 0, 250991
GlobalTracer [candidate] (248.86 ms) : 0, 248860
IAST [baseline] (24.769 ms) : 0, 24769
IAST [candidate] (24.581 ms) : 0, 24581
AppSec [baseline] (185.223 ms) : 0, 185223
AppSec [candidate] (184.798 ms) : 0, 184798
Debugger [baseline] (66.316 ms) : 0, 66316
Debugger [candidate] (65.923 ms) : 0, 65923
Remote Config [baseline] (635.029 µs) : 0, 635
Remote Config [candidate] (602.889 µs) : 0, 603
Telemetry [baseline] (8.695 ms) : 0, 8695
Telemetry [candidate] (8.52 ms) : 0, 8520
Flare Poller [baseline] (3.575 ms) : 0, 3575
Flare Poller [candidate] (3.564 ms) : 0, 3564
section iast
crashtracking [baseline] (1.236 ms) : 0, 1236
crashtracking [candidate] (1.227 ms) : 0, 1227
BytebuddyAgent [baseline] (800.829 ms) : 0, 800829
BytebuddyAgent [candidate] (807.248 ms) : 0, 807248
AgentMeter [baseline] (11.353 ms) : 0, 11353
AgentMeter [candidate] (11.527 ms) : 0, 11527
GlobalTracer [baseline] (238.852 ms) : 0, 238852
GlobalTracer [candidate] (240.599 ms) : 0, 240599
IAST [baseline] (25.717 ms) : 0, 25717
IAST [candidate] (25.92 ms) : 0, 25920
AppSec [baseline] (31.724 ms) : 0, 31724
AppSec [candidate] (33.66 ms) : 0, 33660
Debugger [baseline] (58.496 ms) : 0, 58496
Debugger [candidate] (58.713 ms) : 0, 58713
Remote Config [baseline] (1.125 ms) : 0, 1125
Remote Config [candidate] (528.187 µs) : 0, 528
Telemetry [baseline] (13.995 ms) : 0, 13995
Telemetry [candidate] (12.749 ms) : 0, 12749
Flare Poller [baseline] (3.446 ms) : 0, 3446
Flare Poller [candidate] (3.605 ms) : 0, 3605
section profiling
crashtracking [baseline] (1.188 ms) : 0, 1188
crashtracking [candidate] (1.174 ms) : 0, 1174
BytebuddyAgent [baseline] (692.66 ms) : 0, 692660
BytebuddyAgent [candidate] (691.408 ms) : 0, 691408
AgentMeter [baseline] (9.111 ms) : 0, 9111
AgentMeter [candidate] (9.156 ms) : 0, 9156
GlobalTracer [baseline] (206.735 ms) : 0, 206735
GlobalTracer [candidate] (208.067 ms) : 0, 208067
AppSec [baseline] (32.373 ms) : 0, 32373
AppSec [candidate] (32.877 ms) : 0, 32877
Debugger [baseline] (65.748 ms) : 0, 65748
Debugger [candidate] (66.024 ms) : 0, 66024
Remote Config [baseline] (569.316 µs) : 0, 569
Remote Config [candidate] (577.357 µs) : 0, 577
Telemetry [baseline] (7.916 ms) : 0, 7916
Telemetry [candidate] (7.924 ms) : 0, 7924
Flare Poller [baseline] (3.613 ms) : 0, 3613
Flare Poller [candidate] (3.562 ms) : 0, 3562
ProfilingAgent [baseline] (94.466 ms) : 0, 94466
ProfilingAgent [candidate] (94.681 ms) : 0, 94681
Profiling [baseline] (95.042 ms) : 0, 95042
Profiling [candidate] (95.248 ms) : 0, 95248

Startup time reports for insecure-bank

gantt
    title insecure-bank - global startup overhead: candidate=1.62.0-SNAPSHOT~9eb11aac61, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.058 s) : 0, 1058032
Total [baseline] (8.851 s) : 0, 8851483
Agent [candidate] (1.073 s) : 0, 1072922
Total [candidate] (8.882 s) : 0, 8881549
section iast
Agent [baseline] (1.232 s) : 0, 1231661
Total [baseline] (9.575 s) : 0, 9574595
Agent [candidate] (1.223 s) : 0, 1222672
Total [candidate] (9.556 s) : 0, 9556242

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.058 s	-
Agent	iast	1.232 s	173.63 ms (16.4%)
Total	tracing	8.851 s	-
Total	iast	9.575 s	723.112 ms (8.2%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.073 s	-
Agent	iast	1.223 s	149.75 ms (14.0%)
Total	tracing	8.882 s	-
Total	iast	9.556 s	674.693 ms (7.6%)

gantt
    title insecure-bank - break down per module: candidate=1.62.0-SNAPSHOT~9eb11aac61, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.236 ms) : 0, 1236
crashtracking [candidate] (1.251 ms) : 0, 1251
BytebuddyAgent [baseline] (632.931 ms) : 0, 632931
BytebuddyAgent [candidate] (643.785 ms) : 0, 643785
AgentMeter [baseline] (29.323 ms) : 0, 29323
AgentMeter [candidate] (29.811 ms) : 0, 29811
GlobalTracer [baseline] (248.94 ms) : 0, 248940
GlobalTracer [candidate] (252.393 ms) : 0, 252393
AppSec [baseline] (32.067 ms) : 0, 32067
AppSec [candidate] (32.692 ms) : 0, 32692
Debugger [baseline] (59.086 ms) : 0, 59086
Debugger [candidate] (60.144 ms) : 0, 60144
Remote Config [baseline] (599.948 µs) : 0, 600
Remote Config [candidate] (612.878 µs) : 0, 613
Telemetry [baseline] (8.057 ms) : 0, 8057
Telemetry [candidate] (8.309 ms) : 0, 8309
Flare Poller [baseline] (9.633 ms) : 0, 9633
Flare Poller [candidate] (7.433 ms) : 0, 7433
section iast
crashtracking [baseline] (1.239 ms) : 0, 1239
crashtracking [candidate] (1.221 ms) : 0, 1221
BytebuddyAgent [baseline] (809.019 ms) : 0, 809019
BytebuddyAgent [candidate] (801.161 ms) : 0, 801161
AgentMeter [baseline] (11.504 ms) : 0, 11504
AgentMeter [candidate] (11.372 ms) : 0, 11372
GlobalTracer [baseline] (239.112 ms) : 0, 239112
GlobalTracer [candidate] (239.026 ms) : 0, 239026
IAST [baseline] (25.766 ms) : 0, 25766
IAST [candidate] (25.794 ms) : 0, 25794
AppSec [baseline] (31.104 ms) : 0, 31104
AppSec [candidate] (30.055 ms) : 0, 30055
Debugger [baseline] (61.008 ms) : 0, 61008
Debugger [candidate] (61.004 ms) : 0, 61004
Remote Config [baseline] (1.13 ms) : 0, 1130
Remote Config [candidate] (532.636 µs) : 0, 533
Telemetry [baseline] (11.924 ms) : 0, 11924
Telemetry [candidate] (12.222 ms) : 0, 12222
Flare Poller [baseline] (3.431 ms) : 0, 3431
Flare Poller [candidate] (3.708 ms) : 0, 3708

Load

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	brian.marks/fix-log-injection-smoke-test-flake
git_commit_date	1775744045	1775836958
git_commit_sha	`b266e2d`	`9eb11aa`
release_version	1.62.0-SNAPSHOT~b266e2d0c2	1.62.0-SNAPSHOT~9eb11aac61

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1775839240	1775839240
ci_job_id	1586148686	1586148686
ci_pipeline_id	107145233	107145233
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-0-unchtyed 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-0-unchtyed 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 2 performance improvements and 3 performance regressions! Performance is the same for 16 metrics, 15 unstable metrics.

scenario	Δ mean agg_http_req_duration_p50	Δ mean agg_http_req_duration_p95	Δ mean throughput	candidate mean agg_http_req_duration_p50	candidate mean agg_http_req_duration_p95	candidate mean throughput	baseline mean agg_http_req_duration_p50	baseline mean agg_http_req_duration_p95	baseline mean throughput
scenario:load:insecure-bank:iast:high_load	worse [+55.894µs; +144.728µs] or [+2.199%; +5.694%]	unsure [+100.066µs; +571.357µs] or [+1.340%; +7.650%]	unstable [-214.047op/s; +91.422op/s] or [-15.284%; +6.528%]	2.642ms	7.805ms	1339.125op/s	2.542ms	7.469ms	1400.438op/s
scenario:load:petclinic:tracing:high_load	better [-1429.334µs; -448.428µs] or [-7.740%; -2.428%]	better [-2.240ms; -0.853ms] or [-7.478%; -2.848%]	unstable [-16.288op/s; +40.788op/s] or [-6.540%; +16.379%]	17.528ms	28.411ms	261.281op/s	18.466ms	29.958ms	249.031op/s
scenario:load:petclinic:profiling:high_load	worse [+1.664ms; +2.338ms] or [+9.096%; +12.780%]	worse [+1.340ms; +2.816ms] or [+4.498%; +9.455%]	unstable [-49.213op/s; +3.901op/s] or [-19.651%; +1.558%]	20.294ms	31.867ms	227.781op/s	18.294ms	29.789ms	250.438op/s

Request duration reports for petclinic

gantt
    title petclinic - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~9eb11aac61, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (19.387 ms) : 19191, 19582
.   : milestone, 19387,
appsec (18.72 ms) : 18531, 18908
.   : milestone, 18720,
code_origins (17.711 ms) : 17535, 17888
.   : milestone, 17711,
iast (17.956 ms) : 17778, 18134
.   : milestone, 17956,
profiling (18.634 ms) : 18449, 18818
.   : milestone, 18634,
tracing (18.743 ms) : 18555, 18930
.   : milestone, 18743,
section candidate
no_agent (18.585 ms) : 18393, 18777
.   : milestone, 18585,
appsec (18.922 ms) : 18731, 19112
.   : milestone, 18922,
code_origins (18.155 ms) : 17972, 18338
.   : milestone, 18155,
iast (18.685 ms) : 18499, 18872
.   : milestone, 18685,
profiling (20.501 ms) : 20294, 20708
.   : milestone, 20501,
tracing (17.86 ms) : 17686, 18034
.   : milestone, 17860,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	19.387 ms [19.191 ms, 19.582 ms]	-
appsec	18.72 ms [18.531 ms, 18.908 ms]	-666.786 µs (-3.4%)
code_origins	17.711 ms [17.535 ms, 17.888 ms]	-1.675 ms (-8.6%)
iast	17.956 ms [17.778 ms, 18.134 ms]	-1.43 ms (-7.4%)
profiling	18.634 ms [18.449 ms, 18.818 ms]	-752.69 µs (-3.9%)
tracing	18.743 ms [18.555 ms, 18.93 ms]	-643.806 µs (-3.3%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	18.585 ms [18.393 ms, 18.777 ms]	-
appsec	18.922 ms [18.731 ms, 19.112 ms]	336.737 µs (1.8%)
code_origins	18.155 ms [17.972 ms, 18.338 ms]	-430.348 µs (-2.3%)
iast	18.685 ms [18.499 ms, 18.872 ms]	100.225 µs (0.5%)
profiling	20.501 ms [20.294 ms, 20.708 ms]	1.916 ms (10.3%)
tracing	17.86 ms [17.686 ms, 18.034 ms]	-725.653 µs (-3.9%)

Request duration reports for insecure-bank

gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~9eb11aac61, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.241 ms) : 1228, 1254
.   : milestone, 1241,
iast (3.268 ms) : 3221, 3315
.   : milestone, 3268,
iast_FULL (5.975 ms) : 5914, 6036
.   : milestone, 5975,
iast_GLOBAL (3.678 ms) : 3616, 3739
.   : milestone, 3678,
profiling (2.13 ms) : 2109, 2151
.   : milestone, 2130,
tracing (1.934 ms) : 1917, 1950
.   : milestone, 1934,
section candidate
no_agent (1.283 ms) : 1270, 1296
.   : milestone, 1283,
iast (3.42 ms) : 3368, 3472
.   : milestone, 3420,
iast_FULL (6.126 ms) : 6064, 6188
.   : milestone, 6126,
iast_GLOBAL (3.638 ms) : 3580, 3696
.   : milestone, 3638,
profiling (2.219 ms) : 2197, 2241
.   : milestone, 2219,
tracing (1.871 ms) : 1856, 1887
.   : milestone, 1871,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.241 ms [1.228 ms, 1.254 ms]	-
iast	3.268 ms [3.221 ms, 3.315 ms]	2.027 ms (163.4%)
iast_FULL	5.975 ms [5.914 ms, 6.036 ms]	4.734 ms (381.4%)
iast_GLOBAL	3.678 ms [3.616 ms, 3.739 ms]	2.437 ms (196.3%)
profiling	2.13 ms [2.109 ms, 2.151 ms]	889.01 µs (71.6%)
tracing	1.934 ms [1.917 ms, 1.95 ms]	692.788 µs (55.8%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.283 ms [1.27 ms, 1.296 ms]	-
iast	3.42 ms [3.368 ms, 3.472 ms]	2.137 ms (166.5%)
iast_FULL	6.126 ms [6.064 ms, 6.188 ms]	4.843 ms (377.5%)
iast_GLOBAL	3.638 ms [3.58 ms, 3.696 ms]	2.355 ms (183.6%)
profiling	2.219 ms [2.197 ms, 2.241 ms]	935.903 µs (72.9%)
tracing	1.871 ms [1.856 ms, 1.887 ms]	588.454 µs (45.9%)

Dacapo

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	brian.marks/fix-log-injection-smoke-test-flake
git_commit_date	1775744045	1775836958
git_commit_sha	`b266e2d`	`9eb11aa`
release_version	1.62.0-SNAPSHOT~b266e2d0c2	1.62.0-SNAPSHOT~9eb11aac61

See matching parameters

	Baseline	Candidate
application	biojava	biojava
ci_job_date	1775838899	1775838899
ci_job_id	1586148689	1586148689
ci_pipeline_id	107145233	107145233
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-0-7b269uvy 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-0-7b269uvy 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.

Execution time for biojava

gantt
    title biojava - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~9eb11aac61, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.59 s) : 15590000, 15590000
.   : milestone, 15590000,
appsec (14.812 s) : 14812000, 14812000
.   : milestone, 14812000,
iast (17.946 s) : 17946000, 17946000
.   : milestone, 17946000,
iast_GLOBAL (18.062 s) : 18062000, 18062000
.   : milestone, 18062000,
profiling (15.24 s) : 15240000, 15240000
.   : milestone, 15240000,
tracing (14.828 s) : 14828000, 14828000
.   : milestone, 14828000,
section candidate
no_agent (15.138 s) : 15138000, 15138000
.   : milestone, 15138000,
appsec (15.346 s) : 15346000, 15346000
.   : milestone, 15346000,
iast (18.19 s) : 18190000, 18190000
.   : milestone, 18190000,
iast_GLOBAL (17.891 s) : 17891000, 17891000
.   : milestone, 17891000,
profiling (14.735 s) : 14735000, 14735000
.   : milestone, 14735000,
tracing (14.908 s) : 14908000, 14908000
.   : milestone, 14908000,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	15.59 s [15.59 s, 15.59 s]	-
appsec	14.812 s [14.812 s, 14.812 s]	-778.0 ms (-5.0%)
iast	17.946 s [17.946 s, 17.946 s]	2.356 s (15.1%)
iast_GLOBAL	18.062 s [18.062 s, 18.062 s]	2.472 s (15.9%)
profiling	15.24 s [15.24 s, 15.24 s]	-350.0 ms (-2.2%)
tracing	14.828 s [14.828 s, 14.828 s]	-762.0 ms (-4.9%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	15.138 s [15.138 s, 15.138 s]	-
appsec	15.346 s [15.346 s, 15.346 s]	208.0 ms (1.4%)
iast	18.19 s [18.19 s, 18.19 s]	3.052 s (20.2%)
iast_GLOBAL	17.891 s [17.891 s, 17.891 s]	2.753 s (18.2%)
profiling	14.735 s [14.735 s, 14.735 s]	-403.0 ms (-2.7%)
tracing	14.908 s [14.908 s, 14.908 s]	-230.0 ms (-1.5%)

Execution time for tomcat

gantt
    title tomcat - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~9eb11aac61, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.486 ms) : 1474, 1498
.   : milestone, 1486,
appsec (3.838 ms) : 3615, 4062
.   : milestone, 3838,
iast (2.267 ms) : 2197, 2336
.   : milestone, 2267,
iast_GLOBAL (2.312 ms) : 2242, 2382
.   : milestone, 2312,
profiling (2.101 ms) : 2045, 2156
.   : milestone, 2101,
tracing (2.08 ms) : 2026, 2134
.   : milestone, 2080,
section candidate
no_agent (1.486 ms) : 1474, 1497
.   : milestone, 1486,
appsec (3.838 ms) : 3616, 4060
.   : milestone, 3838,
iast (2.278 ms) : 2208, 2348
.   : milestone, 2278,
iast_GLOBAL (2.321 ms) : 2250, 2392
.   : milestone, 2321,
profiling (2.095 ms) : 2040, 2150
.   : milestone, 2095,
tracing (2.082 ms) : 2028, 2136
.   : milestone, 2082,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.486 ms [1.474 ms, 1.498 ms]	-
appsec	3.838 ms [3.615 ms, 4.062 ms]	2.352 ms (158.3%)
iast	2.267 ms [2.197 ms, 2.336 ms]	780.725 µs (52.5%)
iast_GLOBAL	2.312 ms [2.242 ms, 2.382 ms]	826.15 µs (55.6%)
profiling	2.101 ms [2.045 ms, 2.156 ms]	614.652 µs (41.4%)
tracing	2.08 ms [2.026 ms, 2.134 ms]	593.804 µs (40.0%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.486 ms [1.474 ms, 1.497 ms]	-
appsec	3.838 ms [3.616 ms, 4.06 ms]	2.352 ms (158.3%)
iast	2.278 ms [2.208 ms, 2.348 ms]	792.12 µs (53.3%)
iast_GLOBAL	2.321 ms [2.25 ms, 2.392 ms]	835.051 µs (56.2%)
profiling	2.095 ms [2.04 ms, 2.15 ms]	609.511 µs (41.0%)
tracing	2.082 ms [2.028 ms, 2.136 ms]	595.779 µs (40.1%)

The `check raw file injection` test flakes across 11+ logging backend variants. CI Visibility data shows the failure is bimodal: successful runs complete in 3-9s, but failures sit at exactly 30s (the PollingConditions timeout) with traceCount=0. Nothing in between. This means the process either works or is totally broken, and no amount of timeout increase will help.

The current test is blind during the 30s wait: it just polls traceCount with no diagnostics when the process crashes or hangs.

Changes:
- Add `waitForTraceCountAlive` that checks process liveness on every poll iteration. If the process dies, it fails immediately with the exit code, RC poll count, and last 20 lines of process output.
- On timeout, enrich the error with diagnostic state (process alive?, traceCount, RC polls received, last 30 lines of output) so the next CI failure tells us whether it's a crash, a hang, or a connectivity issue.
- Reorder `waitForTraceCount(4)` before `waitFor` to confirm all traces are delivered while the process is still alive.
- Assert `waitFor` return value for a clear error if the process hangs.

tag: no release notes

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

The liveness check fired before the trace count check, so a normal process exit after delivering all traces was treated as a failure. Check traceCount >= count first and return early if satisfied.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

PollingConditions.eventually only retries AssertionError. The liveness check was throwing AssertionError, so a dead process still waited the full 30s timeout. Switch to RuntimeException so it propagates immediately. Also narrow the catch from Throwable to AssertionError.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
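
The difference between the two exception types can be illustrated with a small retry loop. This is a simplified model of the behavior the commit message describes, not Spock's actual PollingConditions implementation; the class and method names are hypothetical.

```java
// Hypothetical stand-in for a polling helper that retries only on AssertionError.
final class RetryOnAssertion {

  // Runs the condition until it passes or the timeout elapses; returns the attempt count.
  static int eventually(long timeoutMillis, long intervalMillis, Runnable condition) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    int attempts = 0;
    while (true) {
      attempts++;
      AssertionError last;
      try {
        condition.run();
        return attempts;
      } catch (AssertionError e) {
        last = e; // a failed condition: eligible for another attempt
      }
      // Any other exception (for example a RuntimeException) is not caught here,
      // so it leaves the loop on the first attempt.
      if (System.nanoTime() - deadline >= 0) {
        throw new AssertionError("Condition not satisfied after " + attempts + " attempts", last);
      }
      try {
        Thread.sleep(intervalMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while polling", e);
      }
    }
  }
}
```

In this model, a liveness check that throws AssertionError is retried until the deadline, while one that throws a RuntimeException surfaces immediately, which matches the motivation given for the switch.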

bm1549 added labels on Apr 9, 2026: type: bug (Bug report and fix), comp: core (Tracer core), tag: no release notes (Changes to exclude from release notes), tag: ai generated (Largely based on code generated by an AI or LLM)

bm1549 changed the title from "Fix log injection smoke test flakiness from startup timeout" to "Diagnose log injection smoke test flakiness instead of masking it" on Apr 9, 2026

bm1549 and others added 2 commits April 10, 2026 11:58

bm1549 wants to merge 4 commits into master from brian.marks/fix-log-injection-smoke-test-flake
